Administrivia

• Problem set 2: working on it? 

• Today: really using homogeneous coordinates to represent projection, and how to do calibration.

• Reading: Forsyth and Ponce, 1.2 and 1.3 
Last time...



CS 4495 Computer Vision -A. Bobick 


Calibration and Projective Geometry 1 


What is an image? 

■ Last time: a function - a 2D pattern of intensity values 

■ This time: a 2D projection of 3D points 



Figure from US Navy Manual of Basic Optics and Optical 
Instruments, prepared by Bureau of Naval Personnel. Reprinted 
by Dover Publications, Inc., 1969. 
Modeling projection

• The coordinate system
  • We will use the pin-hole model as an approximation
  • Put the optical center (Center Of Projection, COP) at the origin
  • Put the image plane (Projection Plane, PP) in front of the COP. Why?
  • The camera looks down the negative z axis; we need this if we want right-handed coordinates
Modeling projection

• Projection equations
  • Compute intersection with PP of ray from (x, y, z) to COP
  • Derived using similar triangles:

$$(x, y, z) \rightarrow \left(-d\,\frac{x}{z},\; -d\,\frac{y}{z},\; -d\right)$$

• We get the projection by throwing out the last coordinate:

$$(x, y, z) \rightarrow \left(-d\,\frac{x}{z},\; -d\,\frac{y}{z}\right)$$
Distant objects are smaller
Or...

• Assuming a positive focal length, and keeping z the distance:

$$x' = u = f\,\frac{x}{z}$$

$$y' = v = f\,\frac{y}{z}$$



Homogeneous coordinates 

• Is this a linear transformation?
  • No - division by z is non-linear

Trick: add one more coordinate:

$$(x, y) \Rightarrow \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \quad \text{homogeneous image (2D) coordinates}$$

$$(x, y, z) \Rightarrow \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \quad \text{homogeneous scene (3D) coordinates}$$

Converting from homogeneous coordinates:

$$\begin{pmatrix} x \\ y \\ w \end{pmatrix} \Rightarrow (x/w,\; y/w) \qquad \begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix} \Rightarrow (x/w,\; y/w,\; z/w)$$

Homogeneous coordinates are invariant under scale
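The conversion to and from homogeneous coordinates can be sketched in a few lines of NumPy. The helper names `to_homogeneous` and `from_homogeneous` are illustrative choices, not names from the lecture; the example only shows that scaling a homogeneous vector leaves the underlying point unchanged.

```python
import numpy as np

def to_homogeneous(p):
    """Append a 1 to a 2D or 3D point."""
    p = np.asarray(p, dtype=float)
    return np.append(p, 1.0)

def from_homogeneous(ph):
    """Divide by the last coordinate w, then drop it."""
    ph = np.asarray(ph, dtype=float)
    return ph[:-1] / ph[-1]

p = np.array([2.0, 3.0])
ph = to_homogeneous(p)      # (2, 3, 1)
scaled = 5.0 * ph           # (10, 15, 5): a different vector, the same point
recovered = from_homogeneous(scaled)
```

Here `recovered` equals the original `p`, which is what "invariant under scale" means in practice.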
Perspective Projection

Projection is a matrix multiply using homogeneous coordinates.

This is known as perspective projection

• The matrix is the projection matrix 

• The matrix is only defined up to a scale 
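The projection matrix itself did not survive in this text. As a sketch, one matrix consistent with x' = f x/z and y' = f y/z is the 3x4 matrix with rows (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1/f, 0); this specific form and the focal length value below are assumptions made for illustration. The example also checks that multiplying the matrix by a constant does not change the projected point.

```python
import numpy as np

f = 2.0  # hypothetical focal length

# Assumed form of the perspective projection matrix (not taken from the slide).
P = np.array([[1.0, 0.0, 0.0,     0.0],
              [0.0, 1.0, 0.0,     0.0],
              [0.0, 0.0, 1.0 / f, 0.0]])

def project(P, X):
    """Multiply a homogeneous scene point by P, then divide by the last coordinate."""
    xh = P @ X
    return xh[:2] / xh[2]

X = np.array([4.0, 6.0, 8.0, 1.0])   # homogeneous scene point (x, y, z, 1)
uv = project(P, X)                   # (f*x/z, f*y/z)
uv_scaled = project(7.0 * P, X)      # same image point: P is defined only up to scale
```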


S. Seitz 
Geometric Camera calibration

Use the camera to tell you things about the world:

• Relationship between coordinates in the world and coordinates in the image: geometric camera calibration. See Forsyth and Ponce, 1.2 and 1.3; also Szeliski, sections 5.2 and 5.3, for references.

• Made up of 2 transformations:
  • From some (arbitrary) world coordinate system to the camera's 3D coordinate system. Extrinsic parameters (camera pose)
  • From the 3D coordinates in the camera frame to the 2D image plane via projection. Intrinsic parameters
Camera Pose


In order to apply the camera model, objects in the scene 
must be expressed in camera coordinates. 



Calibration target looks tilted from camera 
viewpoint. This can be explained as a 
difference in coordinate systems. 
Rigid Body Transformations

Need a way to specify the six degrees of freedom of a rigid body.

Why are there 6 DOF?

A rigid body is a collection of points whose positions relative to each other can't change.

• Fix one point: three DOF (3)
• Fix second point: two more DOF (+2), since it must maintain the distance constraint
• Third point adds one more DOF (+1), for rotation around the line

3 + 2 + 1 = 6
Notations (from F&P)

• Superscript references coordinate frame

• ${}^{A}P$ is coordinates of P in frame A

• ${}^{B}P$ is coordinates of P in frame B
Translation Only

$${}^{B}P = {}^{A}P + {}^{B}(O_A)$$

or

$${}^{B}P = {}^{B}(O_A) + {}^{A}P$$
Translation

• Using homogeneous coordinates, translation can be expressed as a matrix multiplication.

$${}^{B}P = {}^{A}P + {}^{B}O_A$$

$$\begin{pmatrix} {}^{B}P \\ 1 \end{pmatrix} = \begin{pmatrix} I & {}^{B}O_A \\ 0^T & 1 \end{pmatrix} \begin{pmatrix} {}^{A}P \\ 1 \end{pmatrix}$$

Translation is commutative




Rotation 




Example: Rotation about z axis

Combine 3 to get arbitrary rotation

• Euler angles: Z, X', Z''

• Heading, pitch, roll: world Z, new X, new Y

Three basic matrices: order matters, but we'll not focus on that

$$R_z(\theta) = \begin{pmatrix} \cos(\theta) & -\sin(\theta) & 0 \\ \sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

$$R_x(\phi) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos(\phi) & -\sin(\phi) \\ 0 & \sin(\phi) & \cos(\phi) \end{pmatrix}$$

$$R_y(\kappa) = \begin{pmatrix} \cos(\kappa) & 0 & -\sin(\kappa) \\ 0 & 1 & 0 \\ \sin(\kappa) & 0 & \cos(\kappa) \end{pmatrix}$$
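A short sketch of two of the basic rotation matrices, useful for checking that each is orthonormal and that the order of composition matters. The function names `rot_z` and `rot_x` and the 90-degree angles are illustrative assumptions.

```python
import numpy as np

def rot_z(theta):
    """Rotation about the z axis by theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c,  -s,  0.0],
                     [s,   c,  0.0],
                     [0.0, 0.0, 1.0]])

def rot_x(phi):
    """Rotation about the x axis by phi (radians)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c,  -s],
                     [0.0, s,   c]])

Rz = rot_z(np.pi / 2)
Rx = rot_x(np.pi / 2)

# A rotation matrix has orthonormal columns and determinant +1.
is_orthonormal = np.allclose(Rz.T @ Rz, np.eye(3)) and np.isclose(np.linalg.det(Rz), 1.0)

# Composing in the two possible orders gives different rotations.
order_matters = not np.allclose(Rz @ Rx, Rx @ Rz)
```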
Rotation in homogeneous coordinates

• Using homogeneous coordinates, rotation can be expressed as a matrix multiplication.

$${}^{B}P = {}^{B}_{A}R\;{}^{A}P$$

$$\begin{pmatrix} {}^{B}P \\ 1 \end{pmatrix} = \begin{pmatrix} {}^{B}_{A}R & 0 \\ 0^T & 1 \end{pmatrix} \begin{pmatrix} {}^{A}P \\ 1 \end{pmatrix}$$

Rotation is not commutative
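Rigid transformations

$${}^{B}P = {}^{B}_{A}R\;{}^{A}P + {}^{B}O_A$$

Rigid transformations (con't)

• Unified treatment using homogeneous coordinates.

$$\begin{pmatrix} {}^{B}P \\ 1 \end{pmatrix} = \begin{pmatrix} I & {}^{B}O_A \\ 0^T & 1 \end{pmatrix} \begin{pmatrix} {}^{B}_{A}R & 0 \\ 0^T & 1 \end{pmatrix} \begin{pmatrix} {}^{A}P \\ 1 \end{pmatrix} = \begin{pmatrix} {}^{B}_{A}R & {}^{B}O_A \\ 0^T & 1 \end{pmatrix} \begin{pmatrix} {}^{A}P \\ 1 \end{pmatrix}$$

Invertible!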
Translation and rotation

From frame A to B:

Non-homogeneous ("regular") coordinates:

$${}^{B}P = {}^{B}_{A}R\;{}^{A}P + {}^{B}O_A$$

Homogeneous coordinates:

$$\begin{pmatrix} {}^{B}P \\ 1 \end{pmatrix} = \begin{pmatrix} {}^{B}_{A}R & {}^{B}O_A \\ 0^T & 1 \end{pmatrix} \begin{pmatrix} {}^{A}P \\ 1 \end{pmatrix}$$

where ${}^{B}_{A}R$ is a 3x3 rotation matrix.

Homogeneous coordinates allow us to write coordinate transforms as a single matrix!
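As a sketch of the single-matrix form, the code below packs a rotation and a translation into one 4x4 matrix and inverts it in closed form. The helper names `rigid_transform` and `invert_rigid` and the numeric values are assumptions for illustration; the closed-form inverse uses the fact that the inverse of a rotation matrix is its transpose.

```python
import numpy as np

def rigid_transform(R, t):
    """Build the 4x4 homogeneous matrix [[R, t], [0, 1]]."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def invert_rigid(T):
    """Closed-form inverse: [[R^T, -R^T t], [0, 1]]."""
    R, t = T[:3, :3], T[:3, 3]
    return rigid_transform(R.T, -R.T @ t)

# Hypothetical frame change: 90 degrees about z, origin of A at (1, 2, 3) in B.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])
T_BA = rigid_transform(R, t)

P_A = np.array([1.0, 0.0, 0.0])
P_B = (T_BA @ np.append(P_A, 1.0))[:3]                 # one matrix multiply
P_A_back = (invert_rigid(T_BA) @ np.append(P_B, 1.0))[:3]
```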
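From World to Camera

Non-homogeneous coordinates:

$${}^{C}P = {}^{C}_{W}R\;{}^{W}P + {}^{C}O_W$$

Homogeneous coordinates:

$$\begin{pmatrix} {}^{C}P \\ 1 \end{pmatrix} = \begin{pmatrix} {}^{C}_{W}R & {}^{C}O_W \\ 0^T & 1 \end{pmatrix} \begin{pmatrix} {}^{W}P \\ 1 \end{pmatrix}$$

Here ${}^{C}_{W}R$ is the rotation from world to camera frame and ${}^{C}O_W$ is the translation from world to camera frame.

From world to camera is the extrinsic parameter matrix (4x4)

(sometimes 3x4 if using for next step in projection and not worrying about inversion)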
Now from Camera 3D to Image...

Camera 3D (x, y, z) to 2D (u, v) or (x', y'): Ideal intrinsic parameters

$$u = f\,\frac{x}{z}, \qquad v = f\,\frac{y}{z}$$
Real intrinsic parameters (1)

But "pixels" are in some arbitrary spatial units:

$$u = \alpha\,\frac{x}{z}, \qquad v = \alpha\,\frac{y}{z}$$

Maybe pixels are not square:

$$u = \alpha\,\frac{x}{z}, \qquad v = \beta\,\frac{y}{z}$$

We don't know the origin of our camera pixel coordinates:

$$u = \alpha\,\frac{x}{z} + u_0, \qquad v = \beta\,\frac{y}{z} + v_0$$
Really ugly intrinsic parameters (4)

May be skew between camera pixel axes:

$$u = \alpha\,\frac{x}{z} - \alpha\cot(\theta)\,\frac{y}{z} + u_0$$

$$v = \frac{\beta}{\sin(\theta)}\,\frac{y}{z} + v_0$$
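Intrinsic parameters, homogeneous coordinates

Using homogeneous coordinates we can write this as:

$$\begin{pmatrix} z\,u \\ z\,v \\ z \end{pmatrix} = \begin{pmatrix} \alpha & -\alpha\cot(\theta) & u_0 & 0 \\ 0 & \dfrac{\beta}{\sin(\theta)} & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}$$

$$p' = K\;{}^{C}p$$

where p' is in homogeneous pixels and ${}^{C}p$ is in camera-based homogeneous 3D coordinates.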
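A minimal sketch that builds the 3x4 intrinsic matrix with skew and checks it against the explicit formulas for u and v. The function name `intrinsic_matrix` and all numeric values are hypothetical; theta is the angle between the pixel axes, so theta = 90 degrees means no skew.

```python
import numpy as np

def intrinsic_matrix(alpha, beta, theta, u0, v0):
    """3x4 intrinsic matrix K with scale factors, skew angle and pixel origin."""
    return np.array([
        [alpha, -alpha / np.tan(theta), u0,  0.0],
        [0.0,    beta / np.sin(theta),  v0,  0.0],
        [0.0,    0.0,                   1.0, 0.0],
    ])

# Hypothetical parameter values, chosen only for illustration.
alpha, beta, theta, u0, v0 = 800.0, 820.0, np.deg2rad(89.0), 320.0, 240.0
K = intrinsic_matrix(alpha, beta, theta, u0, v0)

x, y, z = 0.2, -0.1, 2.0                      # point in camera coordinates
ph = K @ np.array([x, y, z, 1.0])             # (z*u, z*v, z)
u, v = ph[:2] / ph[2]

# The same pixel coordinates from the non-homogeneous equations.
u_direct = alpha * x / z - alpha / np.tan(theta) * y / z + u0
v_direct = beta / np.sin(theta) * y / z + v0
```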
Kinder, gentler intrinsics

Can use simpler notation for intrinsics - last column is zero:

$$K = \begin{pmatrix} f & s & c_x \\ 0 & a f & c_y \\ 0 & 0 & 1 \end{pmatrix}$$

s - skew
a - aspect ratio
(5 DOF)

If square pixels, no skew, and optical center is in the center (assume origin in the middle):

$$K = \begin{pmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

In this case only one DOF, focal length f
Combining extrinsic and intrinsic calibration parameters, in homogeneous coordinates

$$p' = K \begin{pmatrix} {}^{C}_{W}R & {}^{C}O_W \\ 0^T & 1 \end{pmatrix} {}^{W}p = M\;{}^{W}p$$

(If K is 3x4)
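The combination can be sketched as a product of a 3x4 intrinsic matrix and a 4x4 extrinsic matrix. All numbers below (focal length 500, pixel origin (320, 240), identity rotation, translation of 5 along z) are made-up values for illustration.

```python
import numpy as np

# Hypothetical 3x4 intrinsic matrix: square pixels, no skew.
K = np.array([[500.0,   0.0, 320.0, 0.0],
              [  0.0, 500.0, 240.0, 0.0],
              [  0.0,   0.0,   1.0, 0.0]])

# Hypothetical extrinsics: no rotation, world origin 5 units in front of the camera.
R = np.eye(3)
t = np.array([0.0, 0.0, 5.0])
E = np.eye(4)
E[:3, :3] = R
E[:3, 3] = t

M = K @ E                                   # 3x4 camera matrix

P_w = np.array([1.0, 2.0, 5.0, 1.0])        # homogeneous world point
ph = M @ P_w                                # homogeneous pixel coordinates
u, v = ph[:2] / ph[2]
```

With these values the point sits at (1, 2, 10) in camera coordinates, so u = 500 * 1/10 + 320 = 370 and v = 500 * 2/10 + 240 = 340.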
Other ways to write the same equation

$$p' = M\;{}^{W}p$$

with p' in pixel coordinates and ${}^{W}p$ in world coordinates.

$$\begin{pmatrix} u \\ v \\ 1 \end{pmatrix} \simeq \begin{pmatrix} s\,u \\ s\,v \\ s \end{pmatrix} = \begin{pmatrix} m_1^T \\ m_2^T \\ m_3^T \end{pmatrix} \begin{pmatrix} {}^{W}p_x \\ {}^{W}p_y \\ {}^{W}p_z \\ 1 \end{pmatrix}$$

($\simeq$ means projectively similar)

Conversion back from homogeneous coordinates leads to:

$$u = \frac{m_1 \cdot P}{m_3 \cdot P}, \qquad v = \frac{m_2 \cdot P}{m_3 \cdot P}$$

where P is the homogeneous world point.
Finally: Camera parameters

A camera (and its matrix) M (or $\Pi$) is described by several parameters

• Translation T of the optical center from the origin of world coords

• Rotation R of the image plane

• focal length f, principal point $(x'_c, y'_c)$, pixel size $(s_x, s_y)$

• T and R are called "extrinsics"; f, the principal point and the pixel size are "intrinsics"

Projection equation

$$x = \begin{pmatrix} s\,x \\ s\,y \\ s \end{pmatrix} = \begin{pmatrix} * & * & * & * \\ * & * & * & * \\ * & * & * & * \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} = M X$$

• The projection matrix models the cumulative effect of all parameters

• Useful to decompose into a series of operations (intrinsics, projection, rotation, translation):

$$M = \begin{pmatrix} f & s & x'_c \\ 0 & a f & y'_c \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} R_{3\times3} & 0_{3\times1} \\ 0_{1\times3} & 1 \end{pmatrix} \begin{pmatrix} I_{3\times3} & T_{3\times1} \\ 0_{1\times3} & 1 \end{pmatrix}$$

where $I_{3\times3}$ is the identity matrix.

• The definitions of these parameters are not completely standardized
  - especially intrinsics: varies from one book to another

DoFs: 5 + 0 + 3 + 3 = 11
Calibration

• How to determine M (or $\Pi$ in some texts)?
Calibration using a reference object

• Place a known object in the scene
  • identify correspondence between image and scene
  • compute mapping from scene to image

Issues

• must know geometry very accurately

• must know 3D->2D correspondence
Resectioning - estimating the camera matrix from known 3D points

• Projective Camera Matrix: a 3x4 matrix M that maps homogeneous 3D points to homogeneous image points

• Only up to a scale, so 11 DOFs.
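Direct linear calibration - homogeneous

$$\begin{pmatrix} u_i \\ v_i \\ 1 \end{pmatrix} \simeq \begin{pmatrix} m_{00} & m_{01} & m_{02} & m_{03} \\ m_{10} & m_{11} & m_{12} & m_{13} \\ m_{20} & m_{21} & m_{22} & m_{23} \end{pmatrix} \begin{pmatrix} X_i \\ Y_i \\ Z_i \\ 1 \end{pmatrix}$$

$$u_i = \frac{m_{00}X_i + m_{01}Y_i + m_{02}Z_i + m_{03}}{m_{20}X_i + m_{21}Y_i + m_{22}Z_i + m_{23}}$$

$$v_i = \frac{m_{10}X_i + m_{11}Y_i + m_{12}Z_i + m_{13}}{m_{20}X_i + m_{21}Y_i + m_{22}Z_i + m_{23}}$$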
Direct linear calibration - homogeneous

$$u_i = \frac{m_{00}X_i + m_{01}Y_i + m_{02}Z_i + m_{03}}{m_{20}X_i + m_{21}Y_i + m_{22}Z_i + m_{23}}$$

$$v_i = \frac{m_{10}X_i + m_{11}Y_i + m_{12}Z_i + m_{13}}{m_{20}X_i + m_{21}Y_i + m_{22}Z_i + m_{23}}$$

$$u_i\,(m_{20}X_i + m_{21}Y_i + m_{22}Z_i + m_{23}) = m_{00}X_i + m_{01}Y_i + m_{02}Z_i + m_{03}$$

$$v_i\,(m_{20}X_i + m_{21}Y_i + m_{22}Z_i + m_{23}) = m_{10}X_i + m_{11}Y_i + m_{12}Z_i + m_{13}$$
Direct linear calibration - homogeneous

Moving every term to one side and collecting the unknowns into a vector gives one pair of equations for each point:

$$\begin{pmatrix} X_i & Y_i & Z_i & 1 & 0 & 0 & 0 & 0 & -u_iX_i & -u_iY_i & -u_iZ_i & -u_i \\ 0 & 0 & 0 & 0 & X_i & Y_i & Z_i & 1 & -v_iX_i & -v_iY_i & -v_iZ_i & -v_i \end{pmatrix} \begin{pmatrix} m_{00} \\ m_{01} \\ m_{02} \\ m_{03} \\ m_{10} \\ m_{11} \\ m_{12} \\ m_{13} \\ m_{20} \\ m_{21} \\ m_{22} \\ m_{23} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
Direct linear calibration - homogeneous

Stacking the pair of equations from every point gives A m = 0, where A has two rows per point and m holds the 12 entries of M.

This is a homogeneous set of equations.

When over constrained, defines a least squares problem
- minimize ||Am||

• Since m is only defined up to scale, solve for unit vector m*

• Solution: m* = eigenvector of $A^TA$ with smallest eigenvalue

• Works with 6 or more points (11 unknowns up to scale, 2 equations per point)
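The homogeneous method can be sketched as follows: build two rows of A per correspondence, then take the eigenvector of A^T A with the smallest eigenvalue. The function names, the synthetic camera `M_true`, and the random test points are assumptions made for illustration; with noise-free data the true matrix is recovered up to scale.

```python
import numpy as np

def build_dlt_matrix(X, x):
    """Stack two rows per 3D-2D correspondence. X is (n, 3), x is (n, 2)."""
    rows = []
    for (Xi, Yi, Zi), (ui, vi) in zip(X, x):
        rows.append([Xi, Yi, Zi, 1, 0, 0, 0, 0, -ui * Xi, -ui * Yi, -ui * Zi, -ui])
        rows.append([0, 0, 0, 0, Xi, Yi, Zi, 1, -vi * Xi, -vi * Yi, -vi * Zi, -vi])
    return np.array(rows, dtype=float)

def solve_dlt(X, x):
    """Unit vector m* minimizing ||A m||, reshaped to a 3x4 matrix."""
    A = build_dlt_matrix(X, x)
    eigvals, eigvecs = np.linalg.eigh(A.T @ A)   # eigenvalues in ascending order
    return eigvecs[:, 0].reshape(3, 4)

# Synthetic, noise-free test data (hypothetical camera and points).
M_true = np.array([[1.0, 0.0, 0.0,  0.1],
                   [0.0, 1.0, 0.0, -0.2],
                   [0.0, 0.0, 1.0,  4.0]])
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(8, 3))
Xh = np.hstack([X, np.ones((8, 1))])
xh = Xh @ M_true.T
x = xh[:, :2] / xh[:, 2:3]

M_est = solve_dlt(X, x)
# M is only defined up to scale: rescale so the last entry matches before comparing.
M_est = M_est / M_est[2, 3] * M_true[2, 3]
```

In practice the image points are noisy, and it is common to normalize the coordinates before building A; that step is left out of this sketch.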
The SVD (singular value decomposition) trick...

Find the x that minimizes ||Ax|| subject to ||x|| = 1.

Let $A = UDV^T$ (singular value decomposition, D diagonal, U and V orthogonal)

Therefore we are minimizing $||UDV^Tx||$

But $||UDV^Tx|| = ||DV^Tx||$ and $||x|| = ||V^Tx||$

Thus minimize $||DV^Tx||$ subject to $||V^Tx|| = 1$

Let $y = V^Tx$: minimize ||Dy|| subject to ||y|| = 1.

But D is diagonal, with decreasing values. So ||Dy|| is minimized when $y = (0, 0, 0, \ldots, 0, 1)^T$

Thus x = Vy is the last column in V. [orthogonal: $V^T = V^{-1}$]

And the singular values of A are the square roots of the eigenvalues of $A^TA$, and the columns of V are the eigenvectors. (Show this?)
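A quick numerical check of the argument above, on a random matrix chosen only for illustration: the last row of V^T returned by `np.linalg.svd` matches (up to sign) the eigenvector of A^T A with the smallest eigenvalue, and the minimum of ||Ax|| equals the smallest singular value.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((10, 4))       # arbitrary over-constrained system

# SVD route: numpy returns V^T, with singular values in decreasing order.
U, d, Vt = np.linalg.svd(A)
x_svd = Vt[-1]                         # last column of V

# Eigenvector route: eigh returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(A.T @ A)
x_eig = eigvecs[:, 0]

# The two unit vectors agree up to sign.
same_direction = np.isclose(abs(x_svd @ x_eig), 1.0)

# The minimum value of ||Ax|| over unit vectors is the smallest singular value.
min_value = np.linalg.norm(A @ x_svd)

# Any other unit vector does no better.
other = rng.standard_normal(4)
other /= np.linalg.norm(other)
other_value = np.linalg.norm(A @ other)
```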
Direct linear calibration - inhomogeneous

• Another approach: 1 in lower r.h. corner for 11 d.o.f

$$\begin{pmatrix} u \\ v \\ 1 \end{pmatrix} \simeq \begin{pmatrix} m_{00} & m_{01} & m_{02} & m_{03} \\ m_{10} & m_{11} & m_{12} & m_{13} \\ m_{20} & m_{21} & m_{22} & 1 \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}$$

Now "regular" least squares since there is a non-variable term in the equations:

$$u_i = \frac{m_{00}X_i + m_{01}Y_i + m_{02}Z_i + m_{03}}{m_{20}X_i + m_{21}Y_i + m_{22}Z_i + 1}$$

$$v_i = \frac{m_{10}X_i + m_{11}Y_i + m_{12}Z_i + m_{13}}{m_{20}X_i + m_{21}Y_i + m_{22}Z_i + 1}$$

Dangerous if $m_{23}$ is really zero!
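With the lower right entry fixed to 1, cross-multiplying leaves u_i and v_i alone on the right-hand side, so the 11 remaining unknowns can be found with ordinary least squares. The sketch below uses `np.linalg.lstsq`; the function name `solve_inhomogeneous` and the synthetic camera are assumptions for illustration.

```python
import numpy as np

def solve_inhomogeneous(X, x):
    """Least-squares estimate of M with m23 fixed to 1. X is (n, 3), x is (n, 2)."""
    rows, rhs = [], []
    for (Xi, Yi, Zi), (ui, vi) in zip(X, x):
        rows.append([Xi, Yi, Zi, 1, 0, 0, 0, 0, -ui * Xi, -ui * Yi, -ui * Zi])
        rhs.append(ui)
        rows.append([0, 0, 0, 0, Xi, Yi, Zi, 1, -vi * Xi, -vi * Yi, -vi * Zi])
        rhs.append(vi)
    A = np.array(rows, dtype=float)
    b = np.array(rhs, dtype=float)
    m, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(m, 1.0).reshape(3, 4)    # put the fixed 1 back in the corner

# Synthetic, noise-free data from a hypothetical camera whose m23 is exactly 1.
M_true = np.array([[0.25, 0.0,  0.0,   0.025],
                   [0.0,  0.25, 0.0,  -0.05],
                   [0.0,  0.0,  0.25,  1.0]])
rng = np.random.default_rng(2)
X = rng.uniform(-1.0, 1.0, size=(8, 3))
Xh = np.hstack([X, np.ones((8, 1))])
xh = Xh @ M_true.T
x = xh[:, :2] / xh[:, 2:3]

M_est = solve_inhomogeneous(X, x)
```

If the true m23 were zero, no rescaling of M could set that entry to 1, which is the danger the slide warns about.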
Direct linear calibration (transformation)

• Advantage:
  • Very simple to formulate and solve. Can be done, say, on a problem set
  • These methods are referred to as "algebraic error" minimization.

• Disadvantages:
  • Doesn't directly tell you the camera parameters (more in a bit)
  • Doesn't model radial distortion
  • Hard to impose constraints (e.g., known focal length)
  • Doesn't minimize the right error function

For these reasons, nonlinear methods are preferred

• Define error function E between projected 3D points and image positions
  - E is a nonlinear function of intrinsics, extrinsics, radial distortion

• Minimize E using nonlinear optimization techniques
  - e.g., variants of Newton's method (e.g., Levenberg-Marquardt)
Geometric Error

$$E = \sum_i d(x_i', \hat{x}_i')$$

where $\hat{x}_i'$ are the predicted image locations.

Minimize E:

$$\min_M \sum_i d(x_i', M X_i)$$
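A sketch of evaluating the geometric (reprojection) error for a candidate M. Taking d to be the squared Euclidean distance in the image is an assumption made here; the slide does not spell out d. The names `project` and `geometric_error` and the synthetic data are also illustrative.

```python
import numpy as np

def project(M, X):
    """Project (n, 3) world points with a 3x4 matrix M; returns (n, 2) image points."""
    Xh = np.hstack([X, np.ones((len(X), 1))])
    xh = Xh @ M.T
    return xh[:, :2] / xh[:, 2:3]

def geometric_error(M, X, x):
    """Sum of squared image distances between measured and predicted points.

    Assumption: d is the squared Euclidean distance.
    """
    residual = x - project(M, X)
    return float(np.sum(residual ** 2))

# Hypothetical camera and points.
M_true = np.array([[1.0, 0.0, 0.0,  0.1],
                   [0.0, 1.0, 0.0, -0.2],
                   [0.0, 0.0, 1.0,  4.0]])
rng = np.random.default_rng(3)
X = rng.uniform(-1.0, 1.0, size=(8, 3))
x_obs = project(M_true, X)             # noise-free measurements

M_bad = M_true.copy()
M_bad[0, 3] += 0.1                     # perturb one entry

err_true = geometric_error(M_true, X, x_obs)
err_bad = geometric_error(M_bad, X, x_obs)
err_bad_scaled = geometric_error(3.0 * M_bad, X, x_obs)
```

Unlike the algebraic error ||Am||, this error does not change when M is rescaled, so no unit-norm constraint is needed when minimizing it with a nonlinear optimizer.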
"Gold Standard" algorithm (Hartley and Zisserman)

Objective

Given $n \geq 6$ 3D to 2D point correspondences $\{X_i \leftrightarrow x_i'\}$, determine the "Maximum Likelihood Estimation" of M

Algorithm

(i) Linear solution:

  (a) (Optional) Normalization: $\tilde{X}_i = U X_i$, $\tilde{x}_i = T x_i$

  (b) Direct Linear Transformation

(ii) Minimization of geometric error: using the linear estimate as a starting point, minimize the geometric error:

$$\min_{\tilde{M}} \sum_i d(\tilde{x}_i, \tilde{M}\tilde{X}_i)$$

(iii) Denormalization: $M = T^{-1}\tilde{M}U$
Finding the 3D Camera Center from P-matrix

Formal way:

• Slight change in notation. Let M = [Q | b] (3x4), b is last column of M

• The camera center is the null space of the projection matrix. Find C such that:

  MC = 0

• Proof: Let X be somewhere between any point P and C

$$X = \lambda P + (1 - \lambda) C$$

$$x = MX = \lambda MP + (1 - \lambda) MC$$

• For all P, all points on the line PC project onto the image of P, so x must equal MP up to scale for every $\lambda$, which requires MC = 0

• Therefore C, the camera center, has to be in the null space
Finding the 3D Camera Center from P-matrix

Easy way:

• Again let M = [Q | b] (3x4), b is last column of M

• Center can be found by:

$$C = \begin{pmatrix} -Q^{-1}b \\ 1 \end{pmatrix}$$
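Both routes to the camera center can be sketched and compared on a synthetic camera. The matrix is built here as M = K [R | -R C] for a known center C, which is an assumed construction used only to have a ground truth; the function names and numbers are illustrative.

```python
import numpy as np

def camera_center(M):
    """Easy way: C = -Q^{-1} b, with M = [Q | b]."""
    Q, b = M[:, :3], M[:, 3]
    return -np.linalg.solve(Q, b)

def camera_center_nullspace(M):
    """Formal way: the null vector of M, taken from the SVD, then dehomogenized."""
    _, _, Vt = np.linalg.svd(M)
    Ch = Vt[-1]                        # right singular vector spanning the null space
    return Ch[:3] / Ch[3]

# Hypothetical camera with a known center.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
a = np.deg2rad(30.0)
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a),  np.cos(a), 0.0],
              [0.0,        0.0,       1.0]])
C_true = np.array([1.0, -2.0, 3.0])
M = K @ np.hstack([R, (-R @ C_true)[:, None]])

C_easy = camera_center(M)
C_null = camera_center_nullspace(M)
residual = M @ np.append(C_easy, 1.0)  # should be (numerically) zero
```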






Alternative: multi-plane calibration 


Images courtesy Jean-Yves Bouguet, Intel Corp. 

Advantage 

• Only requires a plane 

• Don’t have to know positions/orientations 

• Good code available online! 

- Intel's OpenCV library: http://www.intel.com/research/mrl/research/opencv/

- Matlab version by Jean-Yves Bouguet: http://www.vision.caltech.edu/bouguetj/calib_doc/index.html

- Zhengyou Zhang's web site: http://research.microsoft.com/~zhang/Calib/



